ABBREVIATED PSYCHOLOGICAL MEASURES


Ivan N. Mensh, Washington University Medical School, and
William A. Hunt, Northwestern University

INTRODUCTION 

The present military emergency once again has introduced pressing problems in manpower
mobilization, among them the selection of personnel. World War II first high-lighted the use of
abbreviated psychological tests in military selection procedures, although test abbreviation dates
back to the pre-World War I period (57) and to Doll's pioneer work (17) 35 years ago. Indeed, the
psychological "test," first devised by Galton, was introduced as "an experimental method of
measurement . . . characterized by its brevity," but not until World War II was there any large
scale development and application (66) of abbreviated psychological measures. In the past decade,
one of the authors and his co-workers (40), recognizing the utility of abbreviated techniques for
many situations, particularly during periods of rapid mobilization when inadequate numbers of
trained personnel are available for screening military recruits, have developed and evaluated many
series of brief tests.

Beginning with Doll's report in 1917 and in the more than 20 years prior to 1940, there had been
published (65) but 36 articles on abbreviated techniques of psychological measurement. During the
recent war years and in the immediate post-war period about 160 articles in this area appeared,
and at present there are available nearly double that number of reports. These have increased our
fund of information about abbreviated tests, and also have helped to sharpen our focus on the
problems arising from their use. The need for brief or abbreviated measures is seen (65, 112) in
many situations, civilian and military, ranging from application in the neuropsychiatric screening
of military recruits to use in brief-contact clinics and in the screening practices of medical
centers where heavy case loads and few personnel demand rapid survey methods for neurological,
psychosomatic, and other forms of neuropsychiatric illness. Schools, courts, business and industry,
penal and mental institutions, public opinion polls, all have recognized the need for rapid devices
of psychological measurement. Doll (17) early pointed out the importance of economy of time in
psychological study; Hunt et al. (45) have emphasized the economic factors of "mechanics,
manpower, and time;" and Bobbitt, Wechsler, and others (65) also have reported the need for
abbreviated methods of psychological measurement.

Officers of the Navy also have been long concerned with the need for abbreviated psychological
tests, as noted in Louttit's (57) historical review of psychological examining in the Navy. A
symposium on intelligence tests reported in the U. S. N. Medical Bulletin of 1915 (47, 83, 86, 94)
summarized adaptations of the Binet scale which had been tried out since 1912 by several Navy
medical officers. They reported the advantages and limitations of the modified tests, and one of
the writers, G. E. Thomas, presented requirements for tests which read like a 1952 study rather
than one nearly 40 years old. After a year and a half of the adaptation and trial of the Binet
scale at Portsmouth naval prison, Thomas wrote:

There has been much discussion by psychologists outside the Navy and by some of
the medical officers in the service, of the value of the Binet system as a means to determine
the mentality of the recruit ... If a mental test is to be applied in the Navy it should be
devised for the recruiting officer and it should answer the following requirements: 1. It should
be fair in its requirements, and a definite minimum passing mark established. 2. It should be
sufficiently varied to make evident the intelligence, education, and training. 3. It should be
so devised that but slight, if any, variations are possible in the results of the different
examiners. 4. It should not consume much time. (94)

The qualities of "cutting score," range, objectivity, and economy of time are described here,
and to them Jenkins, another of the symposium participants, added (47) the requirement: "It can be
applied by any intelligent person after a little training."

RATIONALE  AND  PROBLEMS 

In 1946, Hunt and Stevenson (44) summarized important considerations underlying the rationale
of abbreviated tests in noting that "... changes in efficiency must be evaluated in the light of the
demands of each separate screening situation, and an increase in brevity is often worth the slight
decrease in test efficiency that it entails. In military testing the value of any test cannot be
determined by fixed and absolute standards. Value is a relative matter determined by the economic
factors involved." In the development of new brief methods of psychological examination or in the
abbreviation of previous, longer forms, not only must the criteria specific to short forms be
satisfied but also those for the usual-length test (65). For example, reliability and validity are
just as real problems in abbreviated tests as in longer ones. Moreover, because of fewer items and
shorter forms, reliability may decrease, as pointed out by Symonds (91), Cattell (10), Wallen (100),
and others. However, there are studies such as that of Brokaw (8) in which the reliability of a
battery of 6 tests for classifying Air Force personnel for technical training changed only from .95
to .90 when abbreviated by 50 per cent, and the validity of the battery changed only from .57 to
.56 after abbreviation.
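
The change in reliability reported by Brokaw can be set beside what the classical Spearman-Brown relation would predict for a test cut to half its length. That relation is not named in this report; it is brought in here only as an illustrative assumption (it treats the retained and discarded items as parallel), and the function name is hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened or shortened.

    length_factor is new length / old length (0.5 means the test is halved).
    Assumes the retained and removed items are parallel, which real
    batteries only approximate.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1.0 + (length_factor - 1.0) * reliability)


# Brokaw's battery: reliability .95 at full length, abbreviated by 50 per cent.
predicted = spearman_brown(0.95, 0.5)
print(f"predicted reliability at half length: {predicted:.3f}")
```

Under these assumptions the predicted value is about .90, in line with the figure reported for the abbreviated battery.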

The general problem of test criteria recently has been reexamined and reported (3) by the
American Psychological Association's Committee on Test Standards. Standards of professional
judgment in selecting and interpreting tests are presented, stressing the need for "sufficient
information about a test so that users will know what reliance can be safely placed on it."
Particularly relevant is the Committee's statement that "somewhat different standards should be
stressed for different types of tests and not all types of information are equally crucial." This
professional group judgment coincides with Hunt and Stevenson's earlier comment on the rationale of
abbreviated tests (vide supra). Further, the Committee's definition of the scope of the standards
presented in their report ("The present standards apply to tests which are distributed for use as
a basis for practical judgments rather than solely for research") applies directly to situations
obtaining in military and naval screening procedures where practical judgments must be made
continually on the acceptance, acceptance with qualification, or rejection of recruits. In the
committee report, standards are given in terms of desired level of information about
interpretation of tests (purposes and applications for which the test is recommended, professional
qualifications required to administer and interpret the test, data to be taken into account other
than test scores), validity (type of validity, whether predictive, status, content, or congruent,
and statistical analysis; validational groups comparable to samples for whom test is designed;
criterion adequacy), reliability (coefficients of internal consistency, equivalence, and
stability), administration and scoring, and scales and norms (percentiles and standard scores,
appropriate not* of norms, definition of normative samples).

These standards apply equally well for abbreviated tests. Already noted is the problem of
reliability based on internal consistency when a parent test is shortened. In a statistical sense,
the method of abbreviation operates to lower reliability, but reliability coefficients of
equivalence and of stability may be computed as alternate forms of an abbreviated test are
developed and applied, either by repeated samplings, by use of the test-retest situation, or in
cross-validational studies with samples other than the normative groups on which the test was
originally evaluated. The other criteria of test use (interpretation, validity, scales and norms)
have the same significance for the abbreviated test as they do for the parent test. In general
then, standards of test construction and use are equally the province of parent and of abbreviated
psychological measures.

Against this background of test development and use there appear several principal problems
in abbreviated testing. Mensh (65) has reviewed these in summarizing studies of the effects of
practice, whether termed "experiential factor," "warm-up," transfer or "functional transfer,"
fatigue, or work decrement; effects of "filler material" or "dead wood" items in long tests;
effects of contextual changes; rapport and motivation in abbreviated testing; examiner differences;
and order of item difficulty and of administration of the abbreviated test within a battery of
tests. Practice effects have long been the concern of psychologists and have been shown to be a
function of various factors; "dead wood" items argue for shorter tests but Hunt, Conrad and others
have cautioned that a priori guesses about efficiency must be replaced by actual trial of items;
contextual, set, and motivational factors have been studied by Conrad (12), Cronbach (16), Horst
(36), McCall (26), Mensh (64), and Sears (84) among others; and specific, restricted goals in
abbreviated testing have been suggested by Brigham and by Doll (7, 17) early in the history of
mental testing, and more recently by Cotzin and Gallagher (14, 15), Hunt and Stevenson (44), Kent
(49, 50, 51), Terman and Oden (93), Vernon (98), Wonderlic and Hovland (110), and Zubin (112).
Typical of the caution of these investigators is this comment: "... a highly serviceable measure
... Its success with defectives should not be assumed for other clinical groups without further
investigation." (15)

The method of specific goals and successive testing proved its efficiency in the military and
naval services where thousands of men were screened by brief examinations (25, 65, 112) designed for
the specific purpose of discriminating a defined sample (principally two groups: mental deficiency,
or personality disturbance serious enough to interfere with adjustment to the armed services) of the
population under test.

Most of these examinations were relatively rough. They stood up well in terms of the number of
the desired population identified (mental defectives, the emotionally unstable, etc.) but falsely
identified many normal individuals as undesirable. The pick-up or correct identification rates of
these tests ranged roughly from 60 to 98 per cent of the population to be identified. The false
positive rates, or number of desirables incorrectly identified as undesirable, however, ranged
roughly from S to 25 per cent of those tested. The final decision as to the use of the test always
depended not only upon the efficiency of performance in these terms, but upon such economic factors
as time and manpower required, whether or not a better technique were available, and the purpose
for which the test were used.

Thus a military screening unit that had neither time nor personnel available for carefully
interviewing and examining all the incoming recruits might use as a rough preliminary sieve a test
which had a pick-up rate of 85% and a false positive rate of 25% and hold for further examination
all men "failing" the test. In such a case, out of 1,000 men to be examined the test might select
300 for further examination. This group of 300 (containing 85% of the unsuitables it was desired to
identify) would then be subjected to a psychiatric interview and further testing when desirable in
order to separate the undesirables from the false positives. Such a procedure would result in a
time and manpower saving of 60 to 70 per cent and still maintain an acceptable screening
performance.
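
The arithmetic behind this example can be sketched as follows. The sketch takes the false positive rate as a fraction of all men tested, as the text defines it, and it assumes about 59 unsuitable men per 1,000, a figure that is not given in the text and is chosen only because it reproduces the group of 300. The names used are hypothetical.

```python
from typing import NamedTuple


class ScreeningYield(NamedTuple):
    true_positives: float
    false_positives: float
    held_for_examination: float
    saving: float  # fraction of men who need no further examination


def screening_yield(n_tested, n_unsuitable, pickup_rate, false_positive_rate):
    """Expected outcome of a rough preliminary screening test.

    pickup_rate is the fraction of unsuitable men the test identifies.
    false_positive_rate follows the text's definition: false positives
    as a fraction of all men tested.
    """
    true_pos = pickup_rate * n_unsuitable
    false_pos = false_positive_rate * n_tested
    held = true_pos + false_pos
    return ScreeningYield(true_pos, false_pos, held, 1.0 - held / n_tested)


# 59 unsuitable men per 1,000 is an assumed figure, not one given in the text.
result = screening_yield(n_tested=1000, n_unsuitable=59,
                         pickup_rate=0.85, false_positive_rate=0.25)
print(f"held for further examination: {result.held_for_examination:.0f}")
print(f"saving in examinations: {result.saving:.0%}")
```

With these inputs roughly 300 men are held and about 70 per cent of the examinations are saved; a different assumed number of unsuitable men would shift both figures.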

The consideration of such economic factors and the acceptance of limited and specific goals
for test performance, however, operate against other aspects of test efficiency; and extreme caution
must be used in both the clinical and practical inferences drawn from such testing. This caution is
present in Doll's earlier insights. After indicating the advantages of abbreviated scales of
intelligence, he objectively balanced them against limitations:

... It may be advisable to emphasize some of the limitations of the brief scale as well
as its advantages. Equivalence in mental age rating must not be misconstrued as meaning
complete psychological or clinical equivalence. Neither may one forget that a mental age
rating does not in itself alone furnish a sufficient means of mental diagnosis or
determinations of feeblemindedness. The more complete measuring scales of intelligence furnish a
much greater variety of standard situations in which the subject may be caused to display
his mental abilities to the trained observer. Moreover, the results of the more extended
examination are more satisfactory by reason of the more elaborate consideration of more
phases of the subject's intelligence and rule out the possibility of invalidation due to
exceptional circumstances of environment or education. (17)

Brigham referred (7) to Binet's explanation of the reason for having a series of tests to measure
intelligence, rather than a single test. This argues against abbreviated testing but must be
considered against the background of the hypothesis suggested by Brigham and successfully put to
test by Doll: that an efficient, brief scale could be developed from a longer one by using those
items and tests which discriminate against some sample of the population, in Doll's case, the
mentally defective. Binet's reasoning is consistent with the reasoning of a number of experimenters
with abbreviated testing, e.g., Wonderlic and Hovland, whose brief form of the Otis
Self-Administering Test (110) included a number of items distributed uniformly over the range of
difficulty. In only one instance have single-item tests been devised (34) and these were for a
specific purpose with a defined population sample.

The inherent nature of abbreviated tests places a limitation on the level of reliability of the
measures. Among factors affecting reliability is the significant one of number of items. In general,
tests are more reliable if the number of items is large (91), and Cattell has framed the question
specifically: "Is it possible to cut down a test much below one hour and still get a measure of
sufficient consistency (reliability), not to mention validity, to be used as a basis for decisions
affecting the individual's whole career?" (10). Lorge, too, has decried (56) the "tendency to use
short tests without adequate consideration of reliability or of consistency ..." Yet Doll's study
of the Binet-Simon Scale showed that the item intercorrelations were so high that more than half
(3 of 5 at each age level) of the tests could be omitted without affecting reliability of the
mental ages obtained (17). Also, Lawshe and Mayer (54) found that brief tests of 20, 40, 60, 80,
and 100 items could be selected from among 300 items with reliability as high as or higher than
the long form. And, with respect to validity, in a study of 800 Army inductees and 625 Army
prisoners Altus (2) concluded that "the validity of a test is not entirely a function of its length
... it is possible by careful item selection to reduce a test to as few as 13 questions and still
retain a fairly good approximate measure of verbal intelligence . . ." However, Altus cautioned
that such approximation should be used only where there is a time premium, permitting only one or
two minutes.

Other limitations have been recognized by Hunt and Stevenson in their statement of "three
common arguments against the use of shorter forms. First, they do not offer the fineness of
discriminatory measure that the long tests do. Second, they do not offer the same richness of
diagnostic possibilities, i.e., in the analysis of scatter. Third, their use demands more clinical
background and skill on the part of the examiner. There is truth in all these arguments, but they
are not as conclusive as they seem" (44). We already have seen that brief tests can be efficient
when designed for a specific discriminatory function, and that screening procedures imply that
selection is a sifting process (44, 112). Also, recent studies of Hunt and his co-workers (37-43,
52, 63, ?3) have demonstrated the clinical diagnostic possibilities of abbreviated tests.

In a comparison of various Wechsler-Bellevue abbreviations, Patterson examined critically the
limitations of short tests of intelligence and personality. He recognized the necessity for
developing and using brief tests, used and evaluated various measures himself but pointed out
certain dangers and was concerned with "undue emphasis in clinical psychology" on the trend toward
shorter tests.

. . . Indeed, the need is for more good comprehensive tests, rather than shorter forms.
For clinical use, it would appear that an hour or more is not too exorbitant a time for
determining the patterning of intellectual functioning, for example. Less than this amount of time
decreases the reliability of the sample of the subject's behavior, and reduces the aspects of
functioning that can be observed and tested. As a result, a short test not only gives an
incomplete picture of the subject's abilities, but often an unreliable picture. Moreover, the
limitation of testing to one or a very few functions or aspects of behavior prohibits the
comparison of the subject's functioning in various areas . . . (72)

In summary then, economy in both subject and examiner time, in equipment, and in personnel
dominates the motivation behind abbreviated measures. Economy in these areas, however, does not
permit the criteria of good test construction to be overlooked. Thus the vital problems of
reliability and validity are as central to abbreviated techniques as to longer forms of
psychological tests. Other problems also must be considered: practice, "warm-up," transfer, and
"experiential" effects in brief testing as well as in standard-length tests; how "dead wood" and
"filler material" can be best localized and eliminated to produce efficient brief measures; the
specific goals and functions of abbreviated techniques; the role of examiner differences in the use
of short tests; and the significance of set, motivational, and contextual factors which may change
as a function of test abbreviation. The limitations of abbreviated testing are reflected in these
many factors. As Hunt, Conrad, and others have pointed out, only experimentation can hold the
answer to these problems. Some of the answers now are available through recent experimental
studies. These show the promise of brief psychological measures which have served a useful function
in meeting the need for psychological evaluation.


AVAILABLE  ABBREVIATED  PSYCHOLOGICAL  TECHNIQUES 

The experimentation within the recent war and post-war periods has produced a number of
abbreviated psychological techniques, some of which are sub-test selections, e.g., vocabulary and
other measures (17, 20, 40, 95, 96) from the parent test; others are item selections, as from the
Minnesota Multiphasic Personality Test (2S, 92); still others are inspection methods as Munroe's
technique (68-70) with the Rorschach test; screening devices of which the Saslow symptom index
(22, £2) is a sample; and specially devised techniques such as the Kent E-G-Y series (49-51).

Abbreviated psychological measures span the entire range of test materials and methods.
There are brief tests for adjustment (1, 2, 76), alcoholic addiction (62), anxiety (25, it, ?2, 105),
aphasia (27, 46), controlled association (61), feeling and attitude (35), food aversions (101, 102),
memory function (21, 89, 104), mental deficiency (40, 48, 90), myokinetic and autokinetic response
(85, 87, 99), neuroticism (19), optimism-pessimism (11), psychiatric prognosis (59), psychosomatic
disturbance (67, 106, 107, 109), public opinion (79-81), reaction time (77), time appreciation (9),
visual-motor function (4, 5, 6, 23, 24, 55, 58, 111), and vocabulary (13, 95, 96), among others.
Samples studied range from childhood through old age, and from "normal" throughout the spectrum of
behavior pathology.

Intelligence measures. The extensive use of the Wechsler-Bellevue Intelligence Scale has served
as stimulus for use of this test as a parent form from which many abbreviated tests have been
selected. Recently there has been a review of research with the W-B Test for the years 1945-50 by
Rabin and Guertin (75) in which shorter forms are discussed. Prior to this review are those of Rabin
in 1945 (74), and Watson in 1946 (103). In these three reviews nearly 200 studies are summarized,
of which about one-fifth are with abbreviated forms. There also have been about 40 studies which
report performance of abbreviated forms of the Stanford-Binet Intelligence Scale. Together, these
two tests have served as parent forms in nearly 90 studies, more than a third of the reports on
test abbreviation published in psychological literature to date. A third test, the Kent Oral
Emergency Test (49-51), different from the W-B and S-B tests in that it was devised as a brief
test, has stimulated about 30 studies, serving either as criterion or as experimental test.

A comprehensive study of the clinical usefulness of abbreviated intelligence tests (37-43) has
been carried out by the authors and their colleagues. Shortly after the close of World War II, a
brief test battery was developed, consisting of the Comprehension and Similarities sub-tests from
the Wechsler-Bellevue Intelligence Scale, Form I, and Thorndike's 15-word vocabulary scale (95)
taken from the Stanford-Binet vocabulary test. Known as the CVS Individual Intelligence Scale, it
was selected for extensive investigation both because of its correlation with external criteria of
intelligence and for its diagnostic potentialities. With a sample of 1,649 Naval recruits (40) a
correlation of .10 was obtained between CVS and the Navy General Classification Test (GCT).
Reliability has been largely inferred on the basis of consistent validity in the testing of
separate samples, but a retesting of 116 mental defectives (40) after an interval of one year gave
a reliability coefficient of .81. In view of the limited range of intelligence in the sample this
can be considered satisfactory. A series of studies of the CVS Scale (40) with large numbers of
naval recruits, and with repeated samples of clinical populations has demonstrated the clinical
usefulness of this brief battery. The psychological literature now includes CVS data on samples of
normal (38, 40, 52), mentally defective (39, 52), brain-damaged (39, 63), and psychotic (39, 52)
subjects. The CVS Scale represents a brief verbal scale which can be memorized by the clinician,
does not involve test equipment other than a record form of a single page, correlates significantly
with external criteria of intelligence, and has potentialities as a rough diagnostic screen for
indicating possible psychological disturbance. For situations where non-verbal material is
indicated, there are several brief, individual intelligence scales which have been evaluated
(38, 39) by Hunt and French. All combine vocabulary with nonverbal materials, correlate
significantly (.69-.83) with both GCT and CVS, and have differentiated clinical samples of
schizophrenics and mental defectives from normals. The goal of satisfactory diagnostic
differentiation demands cross-validation with further samples and a more extended list of clinical
disorders.
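
The remark above that a retest coefficient of .81 is satisfactory in a sample of limited range can be illustrated with the usual range-restriction relation for reliability. This is a sketch only: it assumes that the error variance is the same in the restricted and the wider group, the function name is hypothetical, and the ratio of standard deviations is invented for illustration, since no variances are reported here.

```python
def reliability_in_wider_range(r_restricted: float, sd_ratio: float) -> float:
    """Estimate reliability in a group with a wider spread of scores.

    sd_ratio is (SD in the wider group) / (SD in the restricted group).
    Assumes equal error variance in both groups, so that
    1 - r_wide = (1 - r_restricted) / sd_ratio**2.
    """
    if sd_ratio <= 0:
        raise ValueError("sd_ratio must be positive")
    return 1.0 - (1.0 - r_restricted) / sd_ratio ** 2


# .81 is the retest coefficient reported for the 116 mental defectives.
# An SD ratio of 1.5 is purely illustrative.
estimate = reliability_in_wider_range(0.81, 1.5)
print(f"estimated reliability in the wider group: {estimate:.2f}")
```

The point of the sketch is directional: the same instrument would be expected to show a higher coefficient in a group with a wider spread of ability, which is why .81 in a narrow group is read as acceptable.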


Personality inventories. The history of the personality inventory as a rapid screening method
illustrates how the basic pattern of this technique, laid down 35 years ago, has remained unchanged.
Zubin reports a personal communication from Woodworth in which there is the history of the first
screening device to be used by the military (112). Woodworth had been appointed by the American
Psychological Association in April of 1917 to chair a Committee on Emotional Fitness for Warfare.

Woodworth and Poffenberger worked assiduously on this problem at Columbia and
after trying out various tests "hit upon the idea of assembling minor neurotic symptoms, as
found by psychiatrists in the case histories of individuals who later developed neuroses or
psychoses, and tallying up the score of positive answers . . . intended as a screening device
with primary use of the quantitative score, but also with attention to certain 'starred
questions' which the psychiatrists . . . believed would be of significance quite apart from the
total score."

A comprehensive review by Ellis and Conrad (18) of the military applications of personality
inventories, many of them brief methods, has yielded a number of conclusions about factors
responsible for the favorable results in military practice and the disappointing findings in
civilian practice. After examining studies of military personnel by inventories making use of a
psychiatric criterion (prognosis or diagnosis of neuropsychiatric unfitness for military duty), the
authors concluded that certain factors appear to have played a part in the results obtained. These
factors were criterion contamination and overlap, use of extreme or atypical groups, differential
motivation, inadequate statistical treatment of data, lenient evaluation of "false-positive"
results and neglect of "false-negative" cases, sample heterogeneity, lower intelligence or greater
naivete of military subjects with "less distortion" of responses than among civilians, specialized
design and validation, and application "for screening only, and not for elaborate personality
analysis."

In studies making use of a performance criterion, as success in a training course, prediction was
much less effective than with a psychiatric criterion. Ellis and Conrad attribute the difference to
prior elimination of abnormals in selection for training courses, lack of reliability or validity
of the performance measures, differences in aptitude and previous training rather than differences
in emotional adjustment, and shift of criterion from validation in terms of the psychiatric
criterion in the original standardization, to validation in terms of performance measures. They
state:

1. Personality questionnaires should be especially designed for the group to whom
they are applied, and should be validated against dependable external criteria.
Criterion-contamination should be guarded against; and criterion-overlap, if it occurs, should be
taken into account in evaluating the findings.

2. Special attention should be given to persuading or inducing respondents to
answer the inventory items as truthfully as they can.

3. Personality inventories may possibly be more effective when used with relatively
uneducated and less intelligent groups, than with groups that are more sophisticated.

4. The users of personality inventories should realize that only limited and specialized
demands may be made on the inventory technique; and that broad and incisive personality
diagnosis is still the specialty of the trained clinician employing subtler and more
comprehensive psychological techniques. (18)

At the 1947 Maryland Conference on Military Psychology, Wexler summarized the Navy's
experience with psychiatric screening tests in the following terms:

Perhaps the best way to summarize our experience would be to suggest, as a potentially
valuable research instrument, an inventory with several broadly diagnostic scales, having
items in paired-choice form, with a "self-idealization" scale as a possible corrective for scores
on the units directed toward the measurement of maladjustment, and finally, with a separate
inquiry, perhaps biographical or attitudinal, into defensive or integrating elements which
might serve to counterbalance and negate the total picture of disturbance. (108)

Projective techniques. During the past 18 years there has been great emphasis upon projective
methods, as distinguished from structured intelligence and other diagnostic devices, and personality
inventories whose data are treated in traditional psychometric fashion. It would be expected, therefore,
that demands for abbreviated psychological measures also would include the projective techniques.
As with the other two principal types of psychological methods, there are now available in the projective
field both abbreviated forms of parent tests and brief tests specifically designed for rapid evaluation.

The most widely used method, the Rorschach technique, has been modified by group administration,
multiple-choice selection of responses, decrease in the standard number of stimulus cards presented,
and rapid methods of scoring and interpretation. During World War II, Hertz (33) streamlined
the 8-10 hours of administration, scoring, and interpretation with the standard methods, to 50 minutes
of administration time using summary sheets and check lists to speed interpretation. However, problems
of reliability and validity of the briefer methods are as major (66) as they are in use of the standard
techniques. These difficulties also are found in the Harrower-Erickson screening modification (29-31)
in which multiple-choice responses are introduced rather than free association, thus sharply limiting
the range of response. Zuckerman (113) suggested further modification for large-scale Rorschach testing,
with three exposures for each of the ten stimulus slides (20, 15, and 15-second exposures, respectively)
and ten multiple-choice items per exposure to be responded to on IBM answer sheets and
scored by stencil. Munroe (70) has reported an experiment with group administration, three minutes
per card, and scoring and tabulation in a 20-minute period by means of her Inspection Rorschach
Check List. This latter device (68-70) represents still another avenue for the abbreviation of tests,
with concentration on shortening significantly the time required for scoring and tabulating Rorschach
data. Munroe (71) supports the use of projective methods in group testing in her comment that "the
projective method offers a complex specimen of spontaneous action even when administered to groups
. . . where current individual methods adapted to group use, the group tester for the first time can
approach the problem of evaluation with something of the resourcefulness and knowledge available to
the clinician working with similar individual methods."

Another principal projective technique, often used in conjunction with the Rorschach method, is
the Thematic Apperception Test and this also has been modified in both administration and scoring in
order to reduce the time factor. The use of slides and a reduced number of cards have been
experimented with as methods of economy in administration. Harrison and Rotter (28) used 5 slides
in 30-second exposures with 7½ minutes allowed for each response period; and Smith, Brown, and
Thrower (86) used 8 cards of the TAT series as an aid in history-taking, diagnosis, and treatment
situations in the neuropsychiatric clinic of a general hospital.

In addition to the Rorschach and Thematic Apperception Test abbreviations there are a number
of other projective techniques which require relatively brief periods of administration. These
include among others Mira's myokinetic psychodiagnosis (87), Bender's visual motor gestalt test (4), the
Geosign Test (76), the graphomotor projective technique (53), van Lennep's Four-Picture Test (97),
Machover's Draw A Person Test (60), and word association and sentence completion (78) techniques.


SUMMARY 


The history of abbreviated psychological measurement extends back several decades,
beginning with the efforts of medical officers of the U.S. Navy to adapt the Binet Scale for measuring
intelligence to selection of recruits. Criteria for such brief techniques were formulated at that time
which still hold for present-day testing, covering the requirements of "cutting" scores, adequate
range, objectivity, economy of time, and simplicity of administration and scoring. These pioneers in
brief psychological measurement also were aware of the limitations of the methods. Continuing
concern arising from experimental evidence has indicated caution in their use.

World War II gave the major impetus to abbreviated tests and the present emergency and
manpower mobilization problems again have stimulated interest in the development and validation of
rapid, objective methods for neuropsychiatric screening. There now are available in the psychologic
and psychiatric literature about 300 reports on abbreviated or brief psychological tests. These cover
the range of intelligence and other diagnostic measures, personality inventories, and projective
techniques; and sample populations of normal, neurotic, psychotic, and brain-damaged individuals. These
many studies have attempted to meet the demands for brief psychological methods by the military
and naval services, hospitals, clinics, schools, and business and industry.

Advantages of abbreviated measures lie in their economy of time both in subject and examiner
time, in elimination of "deadwood" and "filler" items, in equipment, and in trained personnel.
These have been demonstrated in studies of verbal and nonverbal test materials where their diagnostic
usefulness has been proven. The limitations of brief measures must be examined in terms of their
specific goals, and the significance of set, motivational, and contextual factors which may change as a
function of test abbreviation. In conclusion we may repeat our previous quotation: ". . . changes in
efficiency must be evaluated in the light of the demands of each separate screening situation, and an
increase in brevity is often worth the slight decrease in test efficiency that it [entails]. In military
testing, the value of any test cannot be determined by fixed and absolute standards. Value is a relative
matter determined by the economic factors involved." (40)